Abstract. In the standard picture of structure formation, initially random-phase
fluctuations are amplified by non-linear gravitational instability to produce a final
distribution of mass which is highly non-Gaussian and has highly coupled Fourier
phases. Second-order statistics, such as the power spectrum, are blind to this kind
of phase association. We discuss the information contained in the phases of
cosmological density fluctuations and their possible use in statistical analysis tools. In
particular, we show how the bispectrum measures a particular form of phase
association called quadratic phase coupling, and show how to visualise phase association
using colour models. These techniques offer the prospect of more complete tests of
initial non-Gaussianity than those available at present.



1 Introduction 

The local Universe displays a rich hierarchical pattern of galaxy clusters and
superclusters. The early Universe, however, was almost smooth, with only
slight ripples seen in the cosmic microwave background radiation. Models
of the evolution of structure link these observations through the effect of
gravity, because the small initially overdense fluctuations attract additional mass
as the Universe expands. During the early stages, the ripples evolve
independently, like linear waves on the surface of deep water. As the structures
grow in mass, they interact with each other in non-linear ways, more like waves
breaking in shallow water. Cosmic structure can be characterized by phase
correlations associated with these non-linear interactions, but this
information is missed by standard analysis techniques such as the power spectrum.
In order to do justice to the large data sets about to become available, it is
important to design techniques sensitive to the fine details of cosmic structure
they will reveal. Here we report a method of quantifying phase information
and suggest how this information may be exploited to build novel
statistical descriptors that can be used to mine the sky more effectively than with
standard methods.

2 Fourier Description of Cosmological Density Fields

In most popular versions of the "gravitational instability" model for the origin
of cosmic structure, particularly those involving cosmic inflation, the initial
fluctuations that seeded the structure formation process form a Gaussian
random field. Deviations from uniformity are expressed in terms of the density
contrast $\delta(\mathbf{x})$, defined by $\delta(\mathbf{x}) = [\rho(\mathbf{x}) - \rho_0]/\rho_0$,
where $\rho_0$ is the average density and $\rho(\mathbf{x})$ is the local matter density.
Because the initial perturbations evolve linearly, it is useful to expand
$\delta(\mathbf{x})$ as a Fourier superposition of plane waves:

$$ \delta(\mathbf{x}) = \sum_{\mathbf{k}} \tilde{\delta}(\mathbf{k}) \exp(i\mathbf{k}\cdot\mathbf{x}). \tag{1} $$

The Fourier transform $\tilde{\delta}(\mathbf{k})$ is complex and therefore possesses both
amplitude $|\tilde{\delta}(\mathbf{k})|$ and phase $\phi_{\mathbf{k}}$, where

$$ \tilde{\delta}(\mathbf{k}) = |\tilde{\delta}(\mathbf{k})| \exp(i\phi_{\mathbf{k}}). \tag{2} $$

Gaussian random fields possess Fourier modes whose real and imaginary parts
are independently distributed. In other words, they have phase angles $\phi_{\mathbf{k}}$
that are independently distributed and uniformly random on the interval
$[0, 2\pi]$. When fluctuations are small, i.e. during the linear regime, the Fourier
modes evolve independently and their phases remain random. In the later
stages of evolution, however, wave modes begin to couple together. In this
regime the phases become non-random and the density field becomes highly
non-Gaussian. Phase coupling is therefore a key consequence of nonlinear
gravitational processes if the initial conditions are Gaussian and a potentially
powerful signature to exploit in statistical tests of this class of models.

A graphic demonstration of the importance of phases in patterns generally
is given in Fig. 1. Since the amplitude of each Fourier mode is unchanged
in the phase reshuffling operation, these two pictures have exactly the same
power spectrum, $P(k) \propto |\tilde{\delta}(\mathbf{k})|^2$. In fact, they have more than that: they
have exactly the same amplitudes for all $\mathbf{k}$. They also have totally different
morphology. The evident shortcomings of $P(k)$ can be partly ameliorated by
defining higher-order quantities such as the bispectrum [3] or correlations
of $\tilde{\delta}(\mathbf{k})^2$.
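
The phase reshuffling operation behind Fig. 1 can be sketched in one dimension. What follows is a minimal illustration and not the procedure used to produce the figure: the test signal, the helper names `dft`, `idft` and `shuffle_phases`, and the naive O(N^2) transform are all assumptions, chosen so that the example needs only the Python standard library.

```python
import cmath
import math
import random


def dft(x):
    """Naive discrete Fourier transform (adequate for small N)."""
    n = len(x)
    return [sum(x[m] * cmath.exp(-2j * math.pi * k * m / n) for m in range(n))
            for k in range(n)]


def idft(modes):
    """Inverse of dft()."""
    n = len(modes)
    return [sum(modes[k] * cmath.exp(2j * math.pi * k * m / n) for k in range(n)) / n
            for m in range(n)]


def shuffle_phases(x, rng):
    """Permute the phases among Fourier modes, keeping every amplitude fixed."""
    n = len(x)
    modes = dft(x)
    # Independent modes of a real signal: 0 < k < n/2. The mean (k = 0) and,
    # for even n, the Nyquist mode are left untouched.
    independent = list(range(1, (n + 1) // 2))
    phases = [cmath.phase(modes[k]) for k in independent]
    rng.shuffle(phases)
    shuffled = list(modes)
    for k, phi in zip(independent, phases):
        shuffled[k] = abs(modes[k]) * cmath.exp(1j * phi)
        shuffled[n - k] = shuffled[k].conjugate()  # keeps the field real
    return [v.real for v in idft(shuffled)]


n = 64
# Hypothetical "clustered" signal: two narrow bumps on a flat background.
original = [math.exp(-((m - 20) / 3.0) ** 2) + 0.5 * math.exp(-((m - 45) / 2.0) ** 2)
            for m in range(n)]
reshuffled = shuffle_phases(original, random.Random(0))

# Mode-by-mode power is identical, although the two signals look different.
power_original = [abs(c) ** 2 for c in dft(original)]
power_reshuffled = [abs(c) ** 2 for c in dft(reshuffled)]
```

Comparing `power_original` with `power_reshuffled` shows agreement to rounding error, while `original` and `reshuffled` differ point by point, which is the one-dimensional analogue of the two panels of Fig. 1.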

Fig. 1. Numerical simulation of galaxy clustering (left) together with a version
generated by randomly reshuffling the phases between Fourier modes of the original
picture (right).

3 The Bispectrum and Phase Coupling

The bispectrum and higher-order polyspectra vanish for Gaussian fields, but
in a non-Gaussian field they may be non-zero. The usefulness of these and
related quantities therefore lies in the fact that they encode some information
about non-linearity and non-Gaussianity. To understand the relationship
between the bispectrum and Fourier phases, it is very helpful to consider the
following toy examples. Imagine a simple density field defined in one spatial
dimension that consists of the superposition of two cosine components:

$$ \delta(x) = A_1 \cos(\lambda_1 x + \phi_1) + A_2 \cos(\lambda_2 x + \phi_2). \tag{3} $$

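The generalisation to several spatial dimensions is trivial. The phases $\phi_1$ and
$\phi_2$ are random and $A_1$ and $A_2$ are constants. We can simplify the following
by introducing a new notation, in which each bracket $\binom{\lambda}{\phi}$ stands for
$\cos(\lambda x + \phi)$:

$$ \delta(x) = A_1 \binom{\lambda_1}{\phi_1} + A_2 \binom{\lambda_2}{\phi_2}. \tag{4} $$

Clearly this example displays no phase correlations. Now consider a new field
obtained from the example (3) through the non-linear transformation

$$ \delta(x) \mapsto \delta(x) + \epsilon\,\delta^2(x), \tag{5} $$

where $\epsilon$ is a constant parameter. Equation (5) may be thought of as a very
phenomenological representation of a perturbation series, with $\epsilon$ controlling
the level of non-linearity. Using the same notation as equation (4), and expanding
the squared cosines and their cross term with the usual product-to-sum identities,
the new field $\delta(x)$ can be written

$$ \delta(x) = B_0 + B_1 \binom{\lambda_1}{\phi_1} + B_2 \binom{\lambda_2}{\phi_2} + B_3 \binom{2\lambda_1}{2\phi_1} + B_4 \binom{2\lambda_2}{2\phi_2} + B_5 \binom{\lambda_1+\lambda_2}{\phi_1+\phi_2} + B_6 \binom{\lambda_1-\lambda_2}{\phi_1-\phi_2}, \tag{6} $$

where the $B_i$ are constants obtained from the $A_i$. Notice in equation (6) that
the phases follow the same kind of harmonic relationship as the wavenumbers.
This form of phase association is termed quadratic phase coupling. It is
this form of phase relationship that appears in the bispectrum. To see this,
consider another two toy examples. First, model A,

$$ \delta_A(x) = \binom{\lambda_1}{\phi_1} + \binom{\lambda_2}{\phi_2} + \binom{\lambda_3}{\phi_3}, \tag{7} $$

in which $\lambda_3 = \lambda_1 + \lambda_2$ but in which $\phi_1$, $\phi_2$ and $\phi_3$ are random; and
second, model B,

$$ \delta_B(x) = \binom{\lambda_1}{\phi_1} + \binom{\lambda_2}{\phi_2} + \binom{\lambda_3 = \lambda_1 + \lambda_2}{\phi_3 = \phi_1 + \phi_2}. \tag{8} $$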

Model A exhibits no phase association; model B displays quadratic phase
coupling. It is straightforward to show that $\langle\delta_A\rangle = \langle\delta_B\rangle = 0$. The
autocovariances are equal:

$$ \xi_A(r) = \langle \delta_A(x)\,\delta_A(x+r) \rangle = \xi_B(r) = \frac{1}{2}\left[\cos(\lambda_1 r) + \cos(\lambda_2 r) + \cos(\lambda_3 r)\right], \tag{9} $$

as are the power spectra, demonstrating that second-order statistics are blind
to phase association.
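
The equality of the two autocovariances can be checked numerically. The sketch below is an illustration under stated assumptions: the wavenumbers 3, 2 and 5, the lag `r`, the helper names, and the choice to evaluate the average as a spatial mean over one period of x are all ours and are not taken from the text.

```python
import math
import random

LAMBDAS = (3, 2, 5)  # illustrative integers with lambda_3 = lambda_1 + lambda_2


def field(x, phases):
    """Sum of three unit-amplitude cosines, as in models A and B."""
    return sum(math.cos(lam * x + phi) for lam, phi in zip(LAMBDAS, phases))


def autocovariance(r, phases, n_x=64):
    """Spatial average of delta(x) * delta(x + r) over one period [0, 2*pi)."""
    xs = [2 * math.pi * j / n_x for j in range(n_x)]
    return sum(field(x, phases) * field(x + r, phases) for x in xs) / n_x


def xi_expected(r):
    """Right-hand side of equation (9)."""
    return 0.5 * sum(math.cos(lam * r) for lam in LAMBDAS)


rng = random.Random(1)
phi1, phi2, phi3 = (rng.uniform(0, 2 * math.pi) for _ in range(3))
phases_a = (phi1, phi2, phi3)         # model A: three independent phases
phases_b = (phi1, phi2, phi1 + phi2)  # model B: quadratic phase coupling

r = 0.7
xi_a = autocovariance(r, phases_a)
xi_b = autocovariance(r, phases_b)
```

Both `xi_a` and `xi_b` agree with `xi_expected(r)` to rounding error whatever phases are drawn, because the phases cancel in every term that survives the spatial average.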

The (reduced) three-point autocovariance function is

$$ \zeta(r_1, r_2) = \langle \delta(x)\,\delta(x+r_1)\,\delta(x+r_2) \rangle. \tag{10} $$

For model A we get

$$ \zeta_A(r_1, r_2) = 0, \tag{11} $$

whereas for model B it is

$$ \zeta_B(r_1, r_2) = \frac{1}{4}\big[\cos(\lambda_2 r_1 + \lambda_1 r_2) + \cos(\lambda_3 r_1 - \lambda_1 r_2) + \cos(\lambda_1 r_1 + \lambda_2 r_2) + \cos(\lambda_3 r_1 - \lambda_2 r_2) + \cos(\lambda_1 r_1 - \lambda_3 r_2) + \cos(\lambda_2 r_1 - \lambda_3 r_2)\big]. \tag{12} $$

The bispectrum, $B(k_1, k_2)$, is defined as the two-dimensional Fourier
transform of $\zeta$, so $B_A(k_1, k_2) = 0$ trivially, whereas $B_B(k_1, k_2)$ consists of a single
spike located somewhere in the region of $(k_1, k_2)$ space defined by $k_2 > 0$,
$k_1 > k_2$ and $k_1 + k_2 < \pi$. If $\lambda_1 > \lambda_2$ then the spike appears at $k_1 = \lambda_1$,
$k_2 = \lambda_2$. Thus the bispectrum measures the phase coupling induced by
quadratic nonlinearities. To reinstate the phase information order-by-order
requires an infinite hierarchy of polyspectra.
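
The contrast between models A and B at third order can also be verified directly. The following sketch rests on assumptions of our own: the wavenumbers 3 and 2, the lags, the function names, and the replacement of the ensemble average over random phases by an average over a small uniform grid of phase values, which makes the result deterministic.

```python
import math
from itertools import product

L1, L2 = 3, 2  # illustrative integer wavenumbers
L3 = L1 + L2


def field(x, phases):
    """Sum of three unit-amplitude cosines with wavenumbers L1, L2, L3."""
    return sum(math.cos(lam * x + phi) for lam, phi in zip((L1, L2, L3), phases))


def three_point(r1, r2, coupled, n_x=64, n_phase=4):
    """Average of delta(x) delta(x+r1) delta(x+r2) over x and over phases."""
    xs = [2 * math.pi * j / n_x for j in range(n_x)]
    grid = [2 * math.pi * m / n_phase for m in range(n_phase)]
    total, count = 0.0, 0
    for p1, p2, p3 in product(grid, repeat=3):
        # Model B ties the third phase to the first two; model A leaves it free.
        phases = (p1, p2, p1 + p2 if coupled else p3)
        total += sum(field(x, phases) * field(x + r1, phases) * field(x + r2, phases)
                     for x in xs) / n_x
        count += 1
    return total / count


def zeta_b_expected(r1, r2):
    """Right-hand side of equation (12)."""
    return 0.25 * (math.cos(L2 * r1 + L1 * r2) + math.cos(L3 * r1 - L1 * r2)
                   + math.cos(L1 * r1 + L2 * r2) + math.cos(L3 * r1 - L2 * r2)
                   + math.cos(L1 * r1 - L3 * r2) + math.cos(L2 * r1 - L3 * r2))


r1, r2 = 0.4, 1.1
zeta_a = three_point(r1, r2, coupled=False)
zeta_b = three_point(r1, r2, coupled=True)
```

With free phases the surviving terms carry the combination phi_1 + phi_2 - phi_3 and average to zero, so `zeta_a` vanishes to rounding error; with the coupling imposed that combination is identically zero and `zeta_b` reproduces `zeta_b_expected(r1, r2)`.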

An alternative way of looking at this issue is to note that the information
needed to fully specify a non-Gaussian field to arbitrary order (or, in a wider
context, the information needed to define an image) resides in the complete
set of Fourier phases. Unfortunately, relatively little is known about the
behaviour of Fourier phases in the nonlinear regime of gravitational
clustering [12,16,17], but it is of great importance to understand phase
correlations in order to design efficient statistical tools for the analysis of
clustering data.






4 Visualizing and Quantifying Phase Information 

A vital first step on the road to a useful quantitative description of phase
information is to represent it visually [17]. In colour image display devices, each
pixel represents the intensity and colour at that position in the image [18,19].
The quantitative specification of colour involves three coordinates describing
the location of that pixel in an abstract colour space, designed to reflect as
accurately as possible the eye's response to light of different wavelengths. In
many devices this colour space is defined in terms of the amounts of Red,
Green and Blue required to construct the appropriate tone; hence the RGB
colour scheme. The scheme we are particularly interested in is based on three
different parameters: Hue, Saturation and Brightness. Hue is the term used
to distinguish between different basic colours (blue, yellow, red and so on).
Saturation refers to the purity of the colour, defined by how much white is
mixed with it. A saturated red hue would be a very bright red, whereas a less
saturated red would be pink. Brightness indicates the overall intensity of the
pixel on a grey scale. The HSB colour model is particularly useful because of
the properties of the 'hue' parameter, which is defined as a circular variable.
If the Fourier transform of a density map has real part $R$ and imaginary part
$I$ then the phase for each wavenumber, given by $\phi = \arctan(I/R)$, can be
represented as a hue for that pixel using the colour circle.
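
The mapping from phase to hue can be sketched with Python's standard `colorsys` module, which calls the same model HSV. This is an illustrative sketch only: the function name `phase_to_rgb`, the default saturation and brightness of 1, and the use of the quadrant-aware `atan2` (so that the phase covers the full circle instead of only half of it) are assumptions of ours.

```python
import colorsys
import math


def phase_to_rgb(mode, saturation=1.0, brightness=1.0):
    """Colour for one complex Fourier mode: hue encodes phase, nothing else."""
    # atan2 resolves the quadrant, giving a phase on the full circle [0, 2*pi).
    phase = math.atan2(mode.imag, mode.real) % (2 * math.pi)
    hue = phase / (2 * math.pi)  # colorsys expects hue in [0, 1]
    return colorsys.hsv_to_rgb(hue, saturation, brightness)


# Phase 0 maps to red and phase pi to the opposite side of the colour circle.
red = phase_to_rgb(complex(1.0, 0.0))
cyan = phase_to_rgb(complex(-1.0, 0.0))
```

Because hue is circular, phases just below 2*pi and just above 0 receive nearly the same colour, so the display has no artificial discontinuity at the wrap-around point.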

The pattern of phase information revealed by this method is related to the
gravitational dynamics of its origin. For example, in our analysis of phase
coupling we introduced a quantity $D_k$, defined by

$$ D_k = \phi_{k+1} - \phi_k, \tag{13} $$

which measures the difference in phase of modes with neighbouring
wavenumbers in one dimension. We refer to $D_k$ as the phase gradient. To apply this
idea to a two-dimensional simulation we simply calculate gradients in the
x and y directions independently. Since the difference between two circular
random variables is itself a circular random variable, the distribution of $D_k$
should initially be uniform. As the fluctuations evolve, waves begin to collapse,
spawning higher-frequency modes in phase with the original. These then
interact with other waves to produce a non-uniform distribution of $D_k$. For
examples, see

http://www.nottingham.ac.uk/~ppzpc/phases/index.
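
The phase gradient of equation (13) is simple to compute once the differences are wrapped back onto the circle. The sketch below is a one-dimensional illustration with assumed names (`phase_gradients`, `spike_modes`) and an assumed test case: a single spike at grid position `x0`, whose Fourier phases are linear in k, so every phase gradient takes the same value.

```python
import cmath
import math

TWO_PI = 2 * math.pi


def phase_gradients(modes):
    """D_k = phi_{k+1} - phi_k, wrapped onto the circle [0, 2*pi)."""
    phases = [cmath.phase(c) % TWO_PI for c in modes]
    return [(b - a) % TWO_PI for a, b in zip(phases, phases[1:])]


# Hypothetical test case: Fourier modes of a unit spike at position x0 on an
# n-point grid, written down analytically for wavenumbers k = 0 .. n/2 - 1.
n, x0 = 32, 5
spike_modes = [cmath.exp(-2j * math.pi * k * x0 / n) for k in range(n // 2)]

gradients = phase_gradients(spike_modes)
expected = (-TWO_PI * x0 / n) % TWO_PI  # the same value for every k
```

A perfectly localised structure therefore gives a distribution of phase gradients concentrated at one value, the opposite extreme from the uniform distribution expected for random phases.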

It is necessary to develop quantitative measures of phase information that
can describe the structure displayed in the colour representations. In the
beginning the phases $\phi_k$ are random and so are the $D_k$ obtained from them.
This corresponds to a state of minimal information, or in other words
maximum entropy. As information flows into the phases the information content
must increase and the entropy decrease. One way to quantify this is by
defining an information entropy on the set of phase gradients. One constructs a
frequency distribution, $f(D)$, of the values of $D_k$ obtained from the whole
map. The entropy is then defined as

$$ S(D) = -\int f(D)\,\log[f(D)]\,\mathrm{d}D, \tag{14} $$

where the integral is taken over all values of $D$, i.e. from 0 to $2\pi$. The use of $D$,
rather than $\phi$ itself, to define entropy is one way of accounting for the lack of
translation invariance of $\phi$, a problem that was missed in previous attempts
to quantify phase entropy. A uniform distribution of $D$ is a state of
maximum entropy (minimum information), corresponding to Gaussian
initial conditions (random phases). This maximal value of $S_{\max} = \log(2\pi)$ is
a characteristic of Gaussian fields. As the system evolves it moves into
states of greater information content (i.e. lower entropy). The scaling of $S$
with clustering growth displays interesting properties, establishing an
important link between the spatial pattern and the physics driving clustering
growth.
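
In practice the entropy of the phase gradients has to be estimated from a finite sample. A minimal sketch, assuming a simple histogram estimate of $f(D)$: the function name `phase_entropy` and the choice of 16 bins are ours, and the value returned depends on the binning, so only comparisons at fixed bin count are meaningful.

```python
import math

TWO_PI = 2 * math.pi


def phase_entropy(gradients, n_bins=16):
    """Histogram estimate of S = -integral of f(D) log f(D) dD over [0, 2*pi)."""
    width = TWO_PI / n_bins
    counts = [0] * n_bins
    for d in gradients:
        # min() guards against a value landing exactly on the upper edge.
        counts[min(int((d % TWO_PI) / width), n_bins - 1)] += 1
    total = len(gradients)
    entropy = 0.0
    for c in counts:
        if c:  # empty bins contribute nothing (f log f -> 0)
            f = c / (total * width)  # estimated probability density in the bin
            entropy -= f * math.log(f) * width
    return entropy


# Evenly spread phase gradients (random-phase limit) versus fully aligned ones.
uniform = [(i + 0.5) * TWO_PI / 160 for i in range(160)]
aligned = [1.0] * 160

s_uniform = phase_entropy(uniform)
s_aligned = phase_entropy(aligned)
```

The evenly spread sample recovers the maximal value log(2*pi), while the fully aligned sample gives the lowest value the chosen binning can resolve, log(2*pi/16).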

5 Discussion 

In fairly recent history, cosmological data sets were sparse and incomplete,
and the statistical methods deployed to analyse them were crude. Second-order
statistics, such as $P(k)$ and $\xi(r)$, are blunt instruments that throw
away the fine details of the delicate pattern of cosmic structure. These
details lie in the distribution of Fourier phases, to which second-order statistics
are blind. It would not do justice to massively improved data if effort were
directed only to better estimates of these quantities. Moreover, as we have
shown, phase information provides a unique fingerprint of gravitational
instability developed from Gaussian initial conditions (which have maximal phase
entropy). Methods such as those we have described above can therefore be
used to test this standard paradigm for structure formation. They can also
furnish direct tests of the presence of initial non-Gaussianity. As
the raw material is increasing in both quality and quantity, it is time to refine
our statistical technology so that the subtle and precious artifacts previously
ignored can be both detected and extracted.